80 research outputs found

Bandwidth extension of narrowband speech

Authors: Expósito Pérez Miquel; Salavedra Molí Josep
Publication venue: Universidad Politécnica de Valencia
Publication date: 01/01/2014

Recently, 4G mobile phone systems have been designed to process wideband speech signals sampled at 16 kHz. However, most of the mobile and classical phone network, as well as current 3G mobile phones, still processes narrowband speech signals sampled at 8 kHz. In the near future, all these systems must coexist. Therefore, a wideband speech signal (with a bandwidth of up to 7.2 kHz) sometimes has to be estimated from an available narrowband one (whose frequency band is 300-3400 Hz). In this work, different audio bandwidth extension techniques have been implemented and evaluated. First, a simple non-model-based algorithm (an interpolation algorithm) was implemented. Second, a model-based algorithm (linear mapping) was designed and evaluated against the first one. Several CMOS (Comparison Mean Opinion Score) [6] listening tests show that the linear mapping algorithm clearly outperforms the interpolation algorithm. The results of these tests are very close to those obtained with the original wideband speech signal.

Postprint (published version)

UPCommons. Portal del coneixement obert de la UPC

Codificación APVQ de voz en banda ancha usando asignación dinámica de bits [APVQ coding of wideband speech using dynamic bit allocation]

Author: Salavedra Molí Josep
Publication venue: Universidad de Valladolid
Publication date: 01/01/1995

This paper describes a coding scheme for broadband speech. It can be seen as a vector extension of a conventional ADPCM encoder. In this scheme, the signal vector is formed with one sample of the normalized prediction error of each subband and is then vector quantized. The scheme combines the advantages of scalar prediction with those of vector quantization (VQ). We handle the high vector dimensionality by using a multi-VQ, which requires first dividing the vector into subvectors and then assigning bits adequately among them. This scheme shows a high capacity to handle signals with a large dynamic range, such as broadband speech. Predictor and codebook designs are discussed. Some results on speech prediction and coding are reported.

Peer reviewed. Postprint (published version)

UPCommons. Portal del coneixement obert de la UPC

APVQ encoder applied to wideband speech coding

Authors: Masgrau Gómez Enrique José; Salavedra Molí Josep
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/1996

The paper describes a coding scheme for broadband speech (sampling frequency 16 kHz). The authors present a wideband speech encoder called APVQ (adaptive predictive vector quantization), which combines subband coding, vector quantization and adaptive prediction. The speech signal is split into 16 subbands by means of a QMF filter bank, so every subband is 500 Hz wide. This APVQ encoder can be seen as a vector extension of a conventional ADPCM encoder. In this scheme, the signal vector is formed with one sample of the normalized prediction error signal coming from the different subbands and is then vector quantized. The prediction error signal is normalized by its gain, and the normalized prediction error signal is the input to the VQ; therefore, an adaptive gain-shape VQ is considered. This APVQ encoder combines the advantages of scalar prediction with those of vector quantization. The authors evaluate wideband speech coding in the range from 1.5 to 2 bits/sample, which leads to a coding rate from 24 to 32 kbps.

Peer reviewed. Postprint (published version)

UPCommons. Portal del coneixement obert de la UPC
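
The figures quoted in the abstract above (500 Hz subbands, 24 to 32 kbps) follow from simple arithmetic. The sketch below checks them. It assumes a uniform filter bank that splits the band from 0 Hz to half the sampling frequency into equal parts. The function names are hypothetical and do not come from the paper.

```python
def subband_width_hz(sampling_rate_hz, num_subbands):
    # A real signal sampled at fs occupies 0..fs/2 (assumption: the bank
    # splits that band into equally wide subbands).
    return (sampling_rate_hz / 2) / num_subbands


def bit_rate_kbps(sampling_rate_hz, bits_per_sample):
    # Samples per second times bits per sample, expressed in kbit/s.
    return sampling_rate_hz * bits_per_sample / 1000


print(subband_width_hz(16000, 16))  # 500.0
print(bit_rate_kbps(16000, 1.5))    # 24.0
print(bit_rate_kbps(16000, 2))      # 32.0
```

The same formula gives 16 kbps at 1 bit/sample, the lower bound quoted in the related 16 to 32 kbps entry.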

Codificación APVQ de voz en banda ancha para velocidades entre 16 y 32 KBPS [APVQ coding of wideband speech at rates between 16 and 32 kbps]

Authors: Masgrau Gómez Enrique José; Salavedra Molí Josep
Publication date: 01/01/1996

This paper describes a coding scheme for broadband speech (sampling frequency 16 kHz). We present a wideband speech encoder called APVQ (Adaptive Predictive Vector Quantization). It combines subband coding, vector quantization and adaptive prediction, as shown in Fig. 1. The speech signal is split into 16 subbands by means of a QMF filter bank, so every subband is 500 Hz wide. This APVQ encoder can be seen as a vector extension of a conventional ADPCM encoder. In this scheme, the signal vector is formed with one sample of the normalized prediction error signal coming from the different subbands and is then vector quantized. The prediction error signal is normalized by its gain, and the normalized prediction error signal is the input to the VQ; therefore, an adaptive gain-shape VQ is considered. This APVQ encoder combines the advantages of scalar prediction with those of vector quantization. We evaluate wideband speech coding in the range from 1 to 2 bits/sample, which leads to a coding rate from 16 to 32 kbps.

Peer reviewed. Postprint (published version)

UPCommons. Portal del coneixement obert de la UPC
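
The gain-shape idea mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' encoder: the gain is taken here as the RMS value of the vector (an assumption), the codebook is a toy example, and the function names are hypothetical. In the paper the gain is adaptive and the codebook is designed for speech.

```python
import math


def gain_shape_quantize(vector, codebook):
    # Gain: RMS value of the vector (assumption made for this sketch).
    gain = math.sqrt(sum(x * x for x in vector) / len(vector))
    if gain == 0.0:
        return 0.0, 0  # an all-zero vector carries no shape information
    # Shape: the gain-normalized vector, which is what the VQ sees.
    shape = [x / gain for x in vector]

    def squared_distance(codeword):
        return sum((s - c) ** 2 for s, c in zip(shape, codeword))

    # Nearest-neighbour search over the shape codebook.
    index = min(range(len(codebook)), key=lambda i: squared_distance(codebook[i]))
    return gain, index


def reconstruct(gain, index, codebook):
    # The decoder rescales the selected shape codeword by the gain.
    return [gain * c for c in codebook[index]]


# Toy two-dimensional shape codebook (hypothetical values).
TOY_CODEBOOK = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]

gain, index = gain_shape_quantize([3.0, -3.0], TOY_CODEBOOK)
print(gain, index)                             # 3.0 1
print(reconstruct(gain, index, TOY_CODEBOOK))  # [3.0, -3.0]
```

Separating gain from shape lets one shape codebook serve vectors of very different energy, which fits the abstract's remark about signals with a large dynamic range.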

Codificación APVQ-extendida de voz de banda ancha [Extended-APVQ coding of wideband speech]

Authors: Masgrau Gómez Enrique José; Salavedra Molí Josep
Publication date: 01/01/1994

This paper describes a coding scheme for broadband speech. It can be seen as a vector extension of a conventional ADPCM encoder. In this scheme, the signal vector is formed with one sample of the normalized prediction error of each subband and is then vector quantized. The scheme combines the advantages of scalar prediction with those of vector quantization (VQ). We handle the high vector dimensionality by using a multi-VQ, which requires first dividing the vector into subvectors and then assigning bits adequately among them. This scheme shows a high capacity to handle signals with a large dynamic range, such as broadband speech.

Peer reviewed. Postprint (published version)

UPCommons. Portal del coneixement obert de la UPC

Third-order cumulant-based Wiener filtering algorithm applied to robust speech recognition

Authors: Hernando Pericás Francisco Javier; Salavedra Molí Josep
Publication date: 01/01/1996

In previous works [5], [6], we studied some speech enhancement algorithms based on the iterative Wiener filtering method due to Lim and Oppenheim [2], where the AR spectral estimation of the speech is carried out using a second-order analysis. In our algorithms, by contrast, the AR estimation is obtained by means of cumulant analysis. This work extends earlier papers by the authors: a cumulant-based Wiener filtering algorithm (AR3_IF) is applied to robust speech recognition. A low-complexity version of this algorithm is tested in the presence of bathroom water noise, and its performance is compared with that of the classical spectral subtraction method. Results are presented for the cases in which the speech recognition system (HTK-MFCC) is trained under clean and under noisy conditions. These results show a lower sensitivity to the presence of water noise when the AR3_IF algorithm is applied within a speech recognition task.

Peer reviewed. Postprint (published version)

UPCommons. Portal del coneixement obert de la UPC

Comparison of different order cumulants in a speech enhancement system by adaptive Wiener filtering

Authors: Masgrau Gómez Enrique José; Moreno Bilbao M. Asunción; Salavedra Molí Josep
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/1993

The authors study some speech enhancement algorithms based on the iterative Wiener filtering method due to Lim and Oppenheim (1978), where the AR spectral estimation of the speech is carried out using a second-order analysis. In their algorithms, by contrast, the authors obtain the AR estimation by means of a cumulant (third- and fourth-order) analysis. The authors compare the behavior of the cumulant algorithms with that of the classical autocorrelation one. Results are presented for the noise that allows the greatest improvement (additive white Gaussian noise) and for the noises that lead to the least improvement (diesel engine and reactor noise). An exhaustive empirical test shows that the cumulant algorithms outperform the original autocorrelation algorithm, especially at low SNR.

Peer reviewed. Postprint (published version)

UPCommons. Portal del coneixement obert de la UPC

Some robust speech enhancement techniques using higher order AR estimation

Authors: Masgrau Gómez Enrique José; Moreno Bilbao M. Asunción; Salavedra Molí Josep
Publication date: 01/01/1994

Peer reviewed. Postprint (published version)

UPCommons. Portal del coneixement obert de la UPC

Speech enhancement by adaptive Wiener filtering based on cumulant AR modelling

Authors: Masgrau Gómez Enrique José; Moreno Bilbao M. Asunción; Salavedra Molí Josep
Publication date: 01/01/1992

Peer reviewed. Postprint (published version)

UPCommons. Portal del coneixement obert de la UPC

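Predicción lineal de la parte causal de la autocorrelación para la identificación del locutor en ambientes ruidosos [Linear prediction of the causal part of the autocorrelation for speaker identification in noisy environments]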

Authors: Hernando Pericás Francisco Javier; Nadeu Camprubí Climent; Salavedra Molí Josep; Villagrasa C
Publication date: 01/01/1994

Recently, a new parametrization technique based on the AR modelling of the one-sided autocorrelation sequence (OSALPC) has been shown to be attractive for speech recognition because of its simplicity and its high recognition performance in noisy conditions. In this paper, that parametrization technique is proposed for speaker identification in noisy environments. Experimental results obtained with a new speaker identification system based on the statistics of the cepstral vectors show that OSALPC also achieves much better results than standard parametrization techniques.

Peer reviewed. Postprint (published version)

UPCommons. Portal del coneixement obert de la UPC